Potential-Based Shaping and Q-Value Initialization are Equivalent

Similar resources

Potential-Based Shaping and Q-Value Initialization are Equivalent

Shaping has proven to be a powerful but precarious means of improving reinforcement learning performance. Ng, Harada, and Russell (1999) proposed the potential-based shaping algorithm for adding shaping rewards in a way that guarantees the learner will learn optimal behavior. In this note, we prove certain similarities between this shaping algorithm and the initialization step required for seve...
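As a rough illustration of the relationship the title refers to, the sketch below compares two tabular Q-learning agents on the same stream of experience. One receives the potential-based shaping reward F(s, s') = gamma * Phi(s') - Phi(s) of Ng, Harada, and Russell (1999) and starts from all-zero Q-values; the other receives no shaping but starts from Q(s, a) = Phi(s). The choice of Q-learning, the toy deterministic MDP, the potential `potential(s)`, and all parameter values are assumptions made for this example and are not taken from the truncated abstract.

```python
import random


def potential(s):
    # Hypothetical potential function Phi(s); any real-valued function of state works.
    return float(s)


def run(shaped, n_states=5, n_actions=2, gamma=0.9, alpha=0.5, steps=200, seed=0):
    rng = random.Random(seed)
    # Hypothetical deterministic toy MDP: random next-state and reward tables.
    nxt = [[rng.randrange(n_states) for _ in range(n_actions)] for _ in range(n_states)]
    rew = [[rng.random() for _ in range(n_actions)] for _ in range(n_states)]
    # Shaped learner starts at 0; unshaped learner starts at Phi(s).
    q = [[0.0 if shaped else potential(s) for _ in range(n_actions)]
         for s in range(n_states)]
    s = 0
    for _ in range(steps):
        # Random exploration; the shared seed gives both learners identical experience.
        a = rng.randrange(n_actions)
        s2, r = nxt[s][a], rew[s][a]
        if shaped:
            # Potential-based shaping reward added to the environment reward.
            r += gamma * potential(s2) - potential(s)
        # Standard Q-learning update.
        q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
        s = s2
    return q


q_shaped = run(shaped=True)
q_init = run(shaped=False)
# Largest deviation from the relation Q_shaped(s, a) + Phi(s) = Q_init(s, a).
gap = max(abs(q_shaped[s][a] + potential(s) - q_init[s][a])
          for s in range(5) for a in range(2))
```

Under these assumptions `gap` stays at floating-point noise level: the two tables differ only by Phi(s), which is constant across actions in a given state, so both learners rank actions identically.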

Dynamic potential-based reward shaping

Potential-based reward shaping can significantly improve the time needed to learn an optimal policy and, in multiagent systems, the performance of the final joint-policy. It has been proven to not alter the optimal policy of an agent learning alone or the Nash equilibria of multiple agents learning together. However, a limitation of existing proofs is the assumption that the potential of a stat...

q-Markov covariance equivalent realizations

Potential-based Shaping in Model-based Reinforcement Learning

Potential-based shaping was designed as a way of introducing background knowledge into model-free reinforcement-learning algorithms. By identifying states that are likely to have high value, this approach can decrease experience complexity—the number of trials needed to find near-optimal behavior. An orthogonal way of decreasing experience complexity is to use a model-based learning approach, b...

Being right on Q: shaping eukaryotic evolution

Reactive oxygen species (ROS) formation by mitochondria is an incompletely understood eukaryotic process. I proposed a kinetic model [BioEssays (2011) 33: 88-94] in which the ratio between electrons entering the respiratory chain via FADH2 or NADH (the F/N ratio) is a crucial determinant of ROS formation. During glucose breakdown, the ratio is low, while during fatty acid breakdown, the ratio...

Journal

Journal title: Journal of Artificial Intelligence Research

Year: 2003

ISSN: 1076-9757

DOI: 10.1613/jair.1190